Learning HMMs with Nonparametric Emissions via Spectral Decompositions of Continuous Matrices

نویسندگان

  • Kirthevasan Kandasamy
  • Maruan Al-Shedivat
  • Eric P. Xing
چکیده

Recently, there has been a surge of interest in using spectral methods for estimating latent variable models. However, it is usually assumed that the distribution of the observations conditioned on the latent variables is either discrete or belongs to a parametric family. In this paper, we study the estimation of an m-state hidden Markov model (HMM) with only smoothness assumptions, such as Hölderian conditions, on the emission densities. By leveraging some recent advances in continuous linear algebra and numerical analysis, we develop a computationally efficient spectral algorithm for learning nonparametric HMMs. Our technique is based on computing an SVD on nonparametric estimates of density functions by viewing them as continuous matrices. We derive sample complexity bounds via concentration results for nonparametric density estimation and novel perturbation theory results for continuous matrices. We implement our method using Chebyshev polynomial approximations. Our method is competitive with other baselines on synthetic and real problems and is also very computationally efficient.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Hilbert Space Embeddings of Hidden Markov Models

Hidden Markov Models (HMMs) are important tools for modeling sequence data. However, they are restricted to discrete latent states, and are largely restricted to Gaussian and discrete observations. And, learning algorithms for HMMs have predominantly relied on local search heuristics, with the exception of spectral methods such as those described below. We propose a nonparametric HMM that exten...

متن کامل

A Bayesian Nonparametric Model for Spectral Estimation of Metastable Systems

The identification of eigenvalues and eigenfunctions from simulation or experimental data is a fundamental and important problem for analysis of metastable systems, because the dominant spectral components usually contain a lot of essential information of the metastable dynamics on slow timescales. It has been shown that the dynamics of a strongly metastable system can be equivalently described...

متن کامل

Implementing spectral methods for hidden Markov models with real-valued emissions

Hidden Markov models (HMMs) are widely used statistical models for modeling sequential data. The parameter estimation for HMMs from time series data is an important learning problem. The predominant methods for parameter estimation are based on local search heuristics, most notably the expectation–maximization (EM) algorithm. These methods are prone to local optima and oftentimes suffer from hi...

متن کامل

Model-based Word Embeddings from Decompositions of Count Matrices

This work develops a new statistical understanding of word embeddings induced from transformed count data. Using the class of hidden Markov models (HMMs) underlying Brown clustering as a generative model, we demonstrate how canonical correlation analysis (CCA) and certain count transformations permit efficient and effective recovery of model parameters with lexical semantics. We further show in...

متن کامل

Spectral Learning of Binomial HMMs for DNA Methylation Data

We consider learning parameters of Binomial Hidden Markov Models, which may be used to model DNA methylation data. The standard algorithm for the problem is EM, which is computationally expensive for sequences of the scale of the mammalian genome. Recently developed spectral algorithms can learn parameters of latent variable models via tensor decomposition, and are highly efficient for large da...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016